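Meta reinforcement learning (Meta-RL) is an approach in which the experience gained from solving a variety of tasks is distilled into a meta-policy. When adapted over only a small number of steps (or even a single step), the meta-policy is able to perform near-optimally on a new, related task. However, a major challenge in adopting this approach for real-world problems is that such problems are often associated with sparse reward functions that only indicate whether a task is partially or fully completed. We consider the setting in which some data, possibly generated by a sub-optimal agent, is available for each task. We then develop a class of algorithms called Enhanced Meta-RL using Demonstrations (EMRLD) that exploit this information, even when it is sub-optimal, to obtain guidance during training. We show how EMRLD jointly uses RL and supervised learning over the offline data to generate a meta-policy that demonstrates monotone performance improvements. We also develop a warm-started variant called EMRLD-WS that is particularly efficient for sub-optimal demonstration data. Finally, we show that our EMRLD algorithms significantly outperform existing approaches in a variety of sparse-reward environments, including that of a mobile robot.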
translated by 谷歌翻译
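We study the problem of safe online convex optimization, where the action at each time step must satisfy a set of linear safety constraints. The goal is to select a sequence of actions that minimizes the regret without violating the safety constraints at any time step (with high probability). The parameters that specify the linear safety constraints are unknown to the algorithm, which has access only to noisy observations of the constraints for the actions it chooses. We propose an algorithm, called the Safe Online Projected Gradient Descent (SO-PGD) algorithm, to address this problem. We show that, under the assumption that a safe baseline action is available, the SO-PGD algorithm achieves a regret of $O(T^{2/3})$. While many algorithms for online convex optimization (OCO) with safety constraints exist in the literature, they allow constraint violations during learning/optimization, and their focus has been on characterizing the cumulative constraint violation. To the best of our knowledge, ours is the first work to provide an algorithm with a regret guarantee that does not violate the linear safety constraints (with high probability) at any time step.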
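As a point of reference for the projection step, the following is a minimal sketch of ordinary online projected gradient descent with a single linear constraint that is assumed to be known exactly. It is not SO-PGD: the abstract's setting has unknown constraint parameters observed only through noise, which this sketch does not model. The loss, the step-size schedule, and all function names are illustrative assumptions.

```python
def project_halfspace(x, a, b):
    """Euclidean projection of x onto the half-space {z : a.z <= b}."""
    violation = sum(ai * xi for ai, xi in zip(a, x)) - b
    if violation <= 0:
        return list(x)  # already feasible, nothing to do
    norm_sq = sum(ai * ai for ai in a)
    return [xi - violation * ai / norm_sq for xi, ai in zip(x, a)]


def online_pgd(targets, a, b, x0):
    """Play one action per round against the loss ||x - target||^2.

    Assumptions (not from the abstract): the constraint (a, b) is known,
    the starting point x0 is feasible, and the step size is 0.5 / sqrt(t).
    """
    x = list(x0)
    actions = []
    for t, target in enumerate(targets, start=1):
        actions.append(list(x))  # action played in round t
        grad = [2.0 * (xi - ti) for xi, ti in zip(x, target)]
        step = 0.5 / t ** 0.5
        x = project_halfspace([xi - step * gi for xi, gi in zip(x, grad)], a, b)
    return actions


# Hypothetical instance: the unconstrained minimizer (2, 2) is infeasible.
a, b = [1.0, 1.0], 1.0
actions = online_pgd(targets=[[2.0, 2.0]] * 20, a=a, b=b, x0=[0.0, 0.0])
```

Because every iterate is projected back onto the half-space and the starting point is feasible, each played action satisfies the constraint; handling constraints that must be estimated from noisy feedback is the harder problem the abstract addresses.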
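Imitation learning (IL) is a popular approach in the continuous control setting because, among other reasons, it circumvents the problems of reward misspecification and exploration in reinforcement learning (RL). In IL from demonstrations, an important challenge is to obtain agent policies that are smooth with respect to their inputs. Learning, through imitation, a policy that is smooth as a function of a large state-action ($s$-$a$) space (typical of high-dimensional continuous control environments) can be challenging. We take a first step towards tackling this issue by applying smoothness-inducing regularization to both the policy and the cost models. Our scheme works by ensuring that the cost function changes in a controlled manner as a function of the $s$-$a$ space, and that the agent policy is well behaved with respect to the state space. We call our new smooth IL algorithm Smooth Policy and Cost Imitation Learning (SPaCIL, pronounced 'Special'). We introduce a novel metric to quantify the smoothness of the learned policies. We demonstrate SPaCIL's superior performance on continuous control tasks from MuJoCo. The algorithm not only outperforms the state-of-the-art IL algorithm on our proposed smoothness metric, but also enjoys the added benefits of faster learning and a substantially higher average return.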
t-SNE remains one of the most popular embedding techniques for visualizing high-dimensional data. Most standard packages of t-SNE, such as scikit-learn, use the Barnes-Hut t-SNE (BH t-SNE) algorithm for large datasets. However, existing CPU implementations of this algorithm are inefficient. In this work, we accelerate the BH t-SNE on CPUs via cache optimizations, SIMD, parallelizing sequential steps, and improving parallelization of multithreaded steps. Our implementation (Acc-t-SNE) is up to 261x and 4x faster than scikit-learn and the state-of-the-art BH t-SNE implementation from daal4py, respectively, on a 32-core Intel(R) Icelake cloud instance.
Position modeling plays a critical role in Transformers. In this paper, we focus on length extrapolation, i.e., training on short texts while evaluating on longer sequences. We define attention resolution as an indicator of extrapolation. Then we propose two designs to improve this metric for Transformers. Specifically, we introduce a relative position embedding to explicitly maximize attention resolution. Moreover, we use blockwise causal attention during inference for better resolution. We evaluate different Transformer variants with language modeling. Experimental results show that our model achieves strong performance in both interpolation and extrapolation settings. The code will be available at https://aka.ms/LeX-Transformer.
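To make the idea of blockwise causal attention concrete, here is a minimal sketch of an attention mask. The abstract does not specify the block layout, so the rule used here is an assumption: each query may attend to keys in its own block and in the immediately preceding block, and never to future positions. The function name and parameters are hypothetical.

```python
def blockwise_causal_mask(seq_len, block_size):
    """Return mask[i][j] == True when query i may attend to key j.

    Assumption: a query sees its own block and the previous block only,
    restricted to positions j <= i (causal).
    """
    mask = []
    for i in range(seq_len):
        # First key position of the block just before the query's block.
        first_visible = max(0, (i // block_size - 1) * block_size)
        mask.append([first_visible <= j <= i for j in range(seq_len)])
    return mask


mask = blockwise_causal_mask(seq_len=8, block_size=2)
for row in mask:
    # 'x' marks an allowed query-key pair, '.' a masked one.
    print("".join("x" if allowed else "." for allowed in row))
```

Under this rule, the number of keys visible to any query is bounded by twice the block size, regardless of how long the evaluated sequence is.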
A true interpreting agent not only understands sign language and translates to text, but also understands text and translates to signs. Much of the AI work in sign language translation to date has focused mainly on translating from signs to text. Towards the latter goal, we propose a text-to-sign translation model, SignNet, which exploits the notion of similarity (and dissimilarity) of visual signs in translating. The module presented here is only one part of a dual-learning, two-task process involving text-to-sign (T2S) as well as sign-to-text (S2T). We currently implement SignNet as a single-channel architecture so that the output of the T2S task can be fed into S2T in a continuous dual-learning framework. By single channel, we refer to a single modality, the body pose joints. In this work, we present SignNet, a T2S model that uses a novel metric embedding learning process to preserve the distances between sign embeddings relative to their dissimilarity. We also describe how to choose positive and negative examples of signs for similarity testing. From our analysis, we observe that metric embedding learning-based models perform significantly better than the other models with traditional losses when evaluated using BLEU scores. In the task of gloss to pose, SignNet performed as well as its state-of-the-art (SoTA) counterparts, and it outperformed them in the task of text to pose, showing noteworthy improvements in BLEU 1 to BLEU 4 scores (BLEU 1: 31->39, ~26% improvement; BLEU 4: 10.43->11.84, ~14% improvement) when tested on the popular RWTH PHOENIX-Weather-2014T benchmark dataset.
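One common way to train embeddings from positive and negative examples is a triplet margin loss. The sketch below uses that standard loss purely as an illustration; the abstract does not state that SignNet uses this exact formulation, and the embeddings, margin, and function names are hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_margin_loss(anchor, positive, negative, margin=1.0):
    """Zero once the negative is at least `margin` farther than the positive."""
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)


# Hypothetical 2-D embeddings of three signs.
anchor = [0.0, 0.0]
positive = [0.0, 1.0]  # a sign similar to the anchor (distance 1)
negative = [3.0, 4.0]  # a dissimilar sign (distance 5)

loss_satisfied = triplet_margin_loss(anchor, positive, negative)
loss_violated = triplet_margin_loss(anchor, negative, positive)  # roles swapped
```

With the roles assigned correctly the loss is zero because the dissimilar sign is already far enough away; swapping the roles produces a positive loss that would push the embeddings apart during training.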
Workloads in modern cloud data centers are becoming increasingly complex. The number of workloads running in cloud data centers has been growing exponentially for the last few years, and cloud service providers (CSPs) have been supporting on-demand services in real time. Recognizing the growing complexity of cloud environments and cloud workloads, hardware vendors such as Intel and AMD are increasingly introducing cloud-specific workload acceleration features in their CPU platforms. These features are typically targeted at popular, commonly used cloud workloads. Nonetheless, uncommon, customer-specific workloads (unknown workloads) may not realize the potential of the underlying platform if their characteristics differ from those of common workloads (known workloads). To address this problem, we develop a machine learning based technique to characterize, profile and predict workloads running in the cloud environment. Experimental evaluation of our technique demonstrates good prediction performance. We also develop techniques to analyze the performance of the model in a standalone manner.
With the rising adoption of machine learning across domains such as banking, pharmaceuticals, and ed-tech, it has become of utmost importance to adopt responsible AI methods to ensure that models do not unfairly discriminate against any group. Given the lack of clean training data, generative adversarial techniques are preferred for generating synthetic data, with several state-of-the-art architectures readily available across various domains, from unstructured data such as text and images to structured datasets modelling fraud detection and many more. These techniques overcome several challenges such as class imbalance, limited training data, and restricted access to data due to privacy issues. Existing work on generating fair data either works only for a certain GAN architecture or is very difficult to tune across GANs. In this paper, we propose a pipeline to generate fairer synthetic data independent of the GAN architecture. The proposed pipeline uses a pre-processing algorithm to identify and remove bias-inducing samples. In particular, we claim that most GANs amplify the bias present in the training data when generating synthetic data, but that once these bias-inducing samples are removed, GANs focus more on real, informative samples. Our experimental evaluation on two open-source datasets demonstrates that the proposed pipeline generates fair data, along with improved performance in some cases.
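We present SeRP, a framework for self-supervised learning of 3D point clouds. SeRP consists of an encoder-decoder architecture that takes perturbed or corrupted point clouds as input and aims to reconstruct the original point cloud without the corruption. The encoder learns high-level latent representations of the point clouds in a low-dimensional subspace, from which the original structure is recovered. In this work, we use Transformer-based and PointNet-based autoencoders. The proposed framework also addresses some of the limitations of Transformer-based masked autoencoders, which are prone to leakage of location information and uneven information density. We trained our models on the complete ShapeNet dataset and evaluated them on a downstream classification task. We show that the pretrained models achieve 0.5-1% higher classification accuracy than networks trained from scratch. Furthermore, we also propose VASP: a Vector-Quantized Autoencoder for Self-supervised Representation Learning for Point Clouds, which applies vector quantization to discrete representation learning in Transformer-based autoencoders.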
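The corrupt-then-reconstruct setup can be illustrated with a small sketch that perturbs a point cloud and measures how far the result is from the original. Two assumptions are made here that the abstract does not confirm: the corruption is Gaussian jitter on the coordinates, and the reconstruction error is a symmetric Chamfer distance (a common choice for point clouds). All names are hypothetical, and no encoder or decoder is implemented.

```python
import random


def perturb(points, noise_std, rng):
    """Add independent Gaussian noise to every coordinate (assumed corruption)."""
    return [tuple(c + rng.gauss(0.0, noise_std) for c in p) for p in points]


def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))


def chamfer_distance(a, b):
    """Symmetric Chamfer distance: mean squared nearest-neighbor distance, both ways."""
    a_to_b = sum(min(squared_distance(p, q) for q in b) for p in a) / len(a)
    b_to_a = sum(min(squared_distance(q, p) for p in a) for q in b) / len(b)
    return a_to_b + b_to_a


rng = random.Random(0)  # fixed seed so the sketch is repeatable
clean = [tuple(rng.uniform(-1.0, 1.0) for _ in range(3)) for _ in range(64)]
corrupted = perturb(clean, noise_std=0.05, rng=rng)

# A reconstruction model would take `corrupted` as input and be trained to
# drive this quantity towards zero with respect to `clean`.
reconstruction_error = chamfer_distance(corrupted, clean)
```

In a full pipeline the autoencoder output, rather than the corrupted cloud itself, would be compared with the clean cloud.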
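Convolutional neural networks (CNNs) are commonly used for computer vision. CNNs are computationally intensive and are difficult to deploy on power-constrained systems such as mobile and Internet-of-Things (IoT) devices. They are computationally intensive because they indiscriminately compute many features on all pixels of the input image. We observe that, for a given computer vision task, images often contain pixels that are irrelevant to the task. For example, if the task is looking for cars, pixels in the sky are not very useful. Therefore, we propose that CNNs be modified to operate only on relevant pixels in order to save computation and energy. We propose a method to study three popular computer vision datasets and find that 48% of pixels are irrelevant. We also propose the focused convolution, which modifies a CNN's convolutional layers to reject the pixels identified as irrelevant. On an embedded device, we observe no loss in accuracy, while inference latency, energy consumption, and multiply-add count are all reduced by about 45%.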
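The saving from skipping irrelevant pixels can be sketched with a single-channel convolution that consults a relevance mask. This is only an illustration of the idea: how relevance is actually determined and represented is not described in the abstract, so the boolean per-output-pixel mask, the toy image, and the function name are all assumptions.

```python
def focused_conv2d(image, kernel, relevant):
    """Valid 2-D cross-correlation that skips output pixels flagged irrelevant.

    Returns the output map and the number of multiply-adds performed.
    Skipped positions are left at 0.0.
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    output = [[0.0] * out_w for _ in range(out_h)]
    multiply_adds = 0
    for r in range(out_h):
        for c in range(out_w):
            if not relevant[r][c]:
                continue  # no work is spent on this pixel
            acc = 0.0
            for i in range(kh):
                for j in range(kw):
                    acc += image[r + i][c + j] * kernel[i][j]
                    multiply_adds += 1
            output[r][c] = acc
    return output, multiply_adds


# Toy 4x4 image and 2x2 kernel; the top output row plays the role of "sky".
image = [[float(r * 4 + c) for c in range(4)] for r in range(4)]
kernel = [[1.0, 0.0], [0.0, 1.0]]
relevant = [[False] * 3, [True] * 3, [True] * 3]
everything = [[True] * 3 for _ in range(3)]

focused_out, focused_macs = focused_conv2d(image, kernel, relevant)
dense_out, dense_macs = focused_conv2d(image, kernel, everything)
```

In this toy case one third of the output pixels are masked, so the multiply-add count drops by one third while the results at the relevant positions are identical to the dense computation.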